Apache for Beginners

Way back when, in the wilds of 1995, there were a great many people who were disgruntled with the state of Web servers. The commercial ones, like Microsoft’s IIS (Internet Information Server) and Netscape’s family of servers, hadn’t been born yet, and the ones put out by college students – well, they sucked.

But lo! What did the early code jockeys do? They made their own damn Web server. They called it Apache (as in a patchy server, because it had a lot of patches). A patch is just what it sounds like – something to plug holes in your code with. This small group of hackers started a project that would eventually create the most popular Web server software in the world.

Not to give ourselves too much credit, but one of the founders of the Apache project was an engineer at HotWired. Don’t you just love us? If you really want to know more about Apache’s history, there’s a nice narrative on its site.

The brilliance of the Apache group’s scheme lay not just in the fine programming, but in the development model it used. Now it is fashionably called open source.

(A small side note: There are several different flavors of open-source development. Apache’s lets anyone create a commercial product based on its code and doesn’t make them share the results if they don’t want to. If I say this model is “better” than the other schemes, hostile email will no doubt follow this article’s publication. But it may well be.)

So … back to the present.

Why should you care? There are two reasons:

It’s free.
It rocks.

If you want to set up a Web site, there are lots of advantages to having the source code.

Another big advantage to the open-source approach is that Apache has attracted lots of developers around the world. They have made blocks of code known in Apache-land as modules. Many of these modules do things you want. And if you can’t find one you like, you can always write your own. (You’ll need to write them in C, or you could use mod_perl to extend the server in Perl … but I’m drifting.)

Downloading Apache

If you’ve landed on this page, you probably will want to know how to install and configure Apache for yourself. You may need a compiler, but don’t let that scare you; there are lots of precompiled versions of Apache now too. You just need to know where to get them (www.apache.org/dist/).

You’ll also need to know what kind of operating system you want to use (my fave for Apache is Linux, but that’s the start of another holy war). For our present purposes, I’ll assume that we’re going to compile Apache with the GNU C compiler in a Unix-based OS.

One neat-o new thing in the latest version of Apache (1.3.x) is something called DSO (dynamic shared object) support. What this means is that you don’t need to precompile everything into your Web server; just add what you need as you need it. I’ll admit to being kind of old school about this sort of thing. Using DSO may not be as stable as compiling all of the modules yourself, so we’ll do the latter in this tutorial. Your mileage may vary.

Here’s a popular way to unpack the distribution. In this example, the file I download is apache_1.3.6.tar.gz, and I put it in the /tmp directory:

cooke@mymachine:/tmp% tar zxvf apache_1.3.6.tar.gz

This creates a directory called apache_1.3.6 that has all of the source files in it. Neat!
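One hedge on that tar command: the z flag is a GNU tar extension, so the tar that ships with some Unix systems won't understand it. A rough sketch of the usual workaround is to let gzip do the decompressing and pipe the result into tar. The helper name unpack_tarball is made up for illustration; it is not part of Apache.

```shell
#!/bin/sh
# unpack_tarball is a hypothetical helper name, not part of the Apache distribution.
# Arguments: the .tar.gz file, then the directory to unpack into (default: current).
unpack_tarball() {
    archive=$1
    dest=${2:-.}
    # gzip -dc decompresses to stdout; tar xf - reads the archive from stdin.
    gzip -dc "$archive" | (cd "$dest" && tar xf -)
}
```

With the file from the example above, that would be called as unpack_tarball /tmp/apache_1.3.6.tar.gz /tmp.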

Compiling Apache

Once you’ve successfully unpacked the distribution, compiling Apache is a snap! (Really, it’s not that bad.)

For those of you who have never touched a compiler before, this is what a compiler does: It takes stuff written in a programming language (C in this case) and makes a binary file (or set of files) that your computer can understand natively as zeroes and ones.

The most popular compiler for Apache is (not surprisingly) a free one: GNU cc (gcc for short). If you’re on a Unix system, you can find out what compilers you have installed with the handy which command, as in:

cooke@mymachine:/tmp% which cc
/usr/local/bin/cc

To find out what version you have (if you have gcc), run it with the -v switch, as in:

cooke@mymachine:/tmp% gcc -v
Reading specs from /usr/local/lib/gcc-lib/sparc-sun-solaris2.6/2.8.1/specs
gcc version 2.8.1

This version is the latest and greatest – or close to that. It’s good to have the latest version, though it’s not strictly necessary.

If you have another compiler (usually called cc), it will probably work just fine. Try it and see!

(I did a little test here when upgrading to the latest version of Apache, and gcc was actually faster than our OS vendor’s “optimized” compiler. Go figure!)
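If you'd rather check for several compilers in one go, the which lookup above can be wrapped in a small loop. This is only a sketch: find_compiler is a made-up helper name, and it uses the shell's command -v rather than which, since command -v is the lookup that POSIX shells are required to provide.

```shell
#!/bin/sh
# find_compiler is a hypothetical helper: it prints the first of the named
# commands that can be found on PATH, and fails if none of them can.
find_compiler() {
    for candidate in "$@"; do
        # command -v prints where the shell would find the command.
        if path=$(command -v "$candidate" 2>/dev/null); then
            echo "$candidate -> $path"
            return 0
        fi
    done
    echo "no compiler found among: $*" >&2
    return 1
}
```

Calling find_compiler gcc cc tries gcc first and falls back to cc, which matches the order of preference in this tutorial.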

The next step is to configure and compile the beastie. There are two different configuration steps. The first indicates how the binary file of the Web server should be compiled; the second step configures the operations of the compiled binary and changes its settings. Think of it this way: The first step makes the program, and the second step tells the program what to do.

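The easiest way to make sure everything works with your compiler and operating environment is to do something like this (the transcript below comes from an apache_1.3.4 tree; use the name of whatever directory you unpacked):

cooke@mymachine:/tmp% cd apache_1.3.4
cooke@mymachine:/tmp/apache_1.3.4% ./configure ; make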

You should see output like this:

Configuring for Apache, Version 1.3.4
 + Warning: Configuring Apache with default settings.
 + This is probably not what you really want.
 + Please read the README.configure and INSTALL files
 + first or at least run './configure --help' for
 + a compact summary of available options.
 + using installation path layout: Apache (config.layout)
Creating Makefile
Creating Configuration.apaci in src
Creating Makefile in src
 + configured for Solaris 260 platform
 + setting C compiler to gcc
 + setting C pre-processor to gcc -E
 + checking for system header files
 + adding selected modules
 + doing sanity check on compiler and options
Creating Makefile in src/support
Creating Makefile in src/main
Creating Makefile in src/ap
Creating Makefile in src/regex
Creating Makefile in src/os/unix
Creating Makefile in src/modules/standard
===> src
===> src/os/unix
gcc -c  -I../../os/unix -I../../include   -DSOLARIS2=260 `../../apaci` os.c
gcc -c  -I../../os/unix -I../../include   -DSOLARIS2=260 `../../apaci` os-inline.c
rm -f libos.a
ar cr libos.a os.o os-inline.o
ranlib libos.a
<=== src/os/unix

You’ll see a whole lot more like this till it’s done.
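One thing about the ./configure ; make line used above: the semicolon means make runs even if configure fell over. Here is a slightly more careful sketch that chains the two with && instead. The name build_apache is invented for this example, and the default of /usr/local/apache for --prefix is an assumption based on the path Apache uses by default.

```shell
#!/bin/sh
# build_apache is a hypothetical helper, not something shipped with Apache.
# It chains the steps with && so that make only runs if configure succeeded.
build_apache() {
    src_dir=$1
    prefix=${2:-/usr/local/apache}
    if [ ! -x "$src_dir/configure" ]; then
        echo "no configure script in $src_dir" >&2
        return 1
    fi
    # --prefix tells configure where the server will eventually be installed.
    (cd "$src_dir" && ./configure --prefix="$prefix" && make)
}
```

From /tmp, that would be build_apache apache_1.3.4, with an optional second argument if you want the server to live somewhere other than the default.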

If it fails with a bad-sounding error message, you might want to try downloading the precompiled binary. Or if you’re on Linux, get the RPM (Red Hat Package Manager) version for it.

Once the compilation is done, test the binary to make sure it works. I usually do something like this:

cooke@mymachine:/tmp/apache_1.3.4% cd src
cooke@mymachine:/tmp/apache_1.3.4/src% ./httpd -l

Which returns:

Compiled-in modules:
  http_core.c
  mod_env.c
  mod_log_config.c
  mod_mime.c
  mod_negotiation.c
  mod_status.c
  mod_include.c
  mod_autoindex.c
  mod_dir.c
  mod_cgi.c
  mod_asis.c
  mod_imap.c
  mod_actions.c
  mod_userdir.c
  mod_alias.c
  mod_access.c
  mod_auth.c
  mod_setenvif.c

The httpd -l switch (that’s a lowercase L, not the digit one) lists the modules that have been compiled into Apache without starting up the program.
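If you only care about one module, you don't have to eyeball the whole list. A minimal sketch, assuming the output looks like the listing above: pipe it through a small filter that checks for an exact name. The helper name has_module is hypothetical.

```shell
#!/bin/sh
# has_module is a hypothetical helper: it reads `httpd -l` style output on
# stdin and succeeds only if the named module source file is listed.
has_module() {
    # Strip the indentation, then look for a whole-line, fixed-string match
    # (-q: no output, -x: match the entire line, -F: no regex interpretation).
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' | grep -qxF "$1"
}

# Typical use from the src directory (assumes ./httpd was built there):
#   ./httpd -l | has_module mod_cgi.c && echo "CGI support is compiled in"
```

Matching the whole line keeps a search for mod_auth.c from being satisfied by some longer name that merely contains it.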

Apache assumes that you want its root directory to be /usr/local/apache/ (called the ServerRoot) and that the configuration file used to start the server will be /usr/local/apache/conf/httpd.conf. If that’s all OK with you, go ahead and copy the httpd file to /usr/local/apache/bin/ (after making that directory, of course!).
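Spelled out as commands, the copy step might look something like the sketch below. The function name install_httpd is invented for illustration, and creating the conf and logs directories at the same time is my own addition (the server expects to find its configuration and write its logs there). The second argument lets you try it against a scratch directory before touching /usr/local.

```shell
#!/bin/sh
# install_httpd is a hypothetical helper. The second argument is the
# ServerRoot; it defaults to the path Apache assumes.
install_httpd() {
    built_binary=$1
    server_root=${2:-/usr/local/apache}
    # mkdir -p creates missing parent directories and is happy if they exist.
    mkdir -p "$server_root/bin" "$server_root/conf" "$server_root/logs" || return 1
    cp "$built_binary" "$server_root/bin/httpd" || return 1
    chmod 755 "$server_root/bin/httpd"
}
```

From the src directory, install_httpd ./httpd does the default install; you will most likely need to be root to write under /usr/local.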

Now let’s see what’s in the sample configuration files!

(You’re almost there – trust me.)

Configuring Apache

Now that you have a working binary (you can tell by using one of the command-line switches, like ./httpd -l), it’s time to configure it. This is the biggest part of the job sometimes, and there are so many options available that it may be worth buying a book, like the recently revised O’Reilly book on Apache, which is pretty good. But if you’re like me, you’ll just mess with the configuration and let the book sit on your shelf until you run into some problem you can’t figure out.

The basic steps are pretty easy to lay out:

Rename httpd.conf-dist and the other distribution .conf files in the conf/ subdirectory so that they lose the -dist suffix. All of the directives (Apache commands) are listed on Apache’s Web site, which is the best source for documentation. And many people can just use the standard versions of the .conf files and be perfectly happy. If you run ./httpd -h, you’ll see a complete list of directives that the binary supports (this neat trick helps if you forget what something is called).

You’ll also need to specify the username that Apache runs as, using the User directive in httpd.conf. (“nobody” is the conventional choice, but you can use anything you want.) Make sure that the username you choose has permissions to do things on your server, like read, write, and execute appropriate files and directories. Get the permissions issues ironed out early because they can and do bite you in the butt later.

You’ll soon want to do things like access control for certain documents or directories, match different HTML files for different browsers, or do funky server-side rewriting of URLs for your own funky reasons. Get down with your bad self! Apache has an answer for your troubles. Sometimes the answer is easy, and sometimes it is hard. I never promised you a rose garden.
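Back to the first of those steps for a second: dropping the -dist suffix is easy to do by hand, but here is a rough sketch of doing it in a loop. The helper name activate_dist_configs is made up, and it copies rather than renames, on the assumption that you might want the pristine originals around to compare against later. It also refuses to clobber a .conf file you've already edited.

```shell
#!/bin/sh
# activate_dist_configs is a hypothetical helper: for every *.conf-dist file
# in the given directory it creates a copy without the -dist suffix, leaving
# the original in place and never overwriting an existing .conf file.
activate_dist_configs() {
    conf_dir=$1
    for dist in "$conf_dir"/*.conf-dist; do
        [ -f "$dist" ] || continue        # the glob matched nothing
        target=${dist%-dist}              # httpd.conf-dist -> httpd.conf
        [ -e "$target" ] || cp "$dist" "$target"
    done
}
```

Run it as activate_dist_configs conf from the top of the unpacked source directory, then start editing the copies.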

But using Apache opens the way to lots of other cool open-source projects like PHP, a great scripting interface to databases and sort of a competitor to ASP, and mod_perl, a way to embed Perl programs into your binary and make them bitchin’ fast.

There are a few ways to run the thing once it’s compiled and configured. Most people just type /usr/local/apache/bin/httpd (or wherever you copied the binary), and Apache happily chugs along – if you’ve configured it correctly. If you haven’t, you may see an error at the command line with the configuration directive you’ve botched or you may see an error in the error_log, which lives by default in /usr/local/apache/logs.
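When the server starts but something still smells wrong, the error_log is the first place to look. A minimal sketch for peeking at the end of it, assuming the default log location mentioned above; show_recent_errors is a made-up name, and the default of 20 lines is arbitrary.

```shell
#!/bin/sh
# show_recent_errors is a hypothetical helper that prints the last lines of
# the error_log under a given ServerRoot (default: /usr/local/apache).
# Optional second argument: how many lines to show (default: 20).
show_recent_errors() {
    server_root=${1:-/usr/local/apache}
    log_file=$server_root/logs/error_log
    if [ ! -f "$log_file" ]; then
        echo "no error_log at $log_file (has the server been started?)" >&2
        return 1
    fi
    tail -n "${2:-20}" "$log_file"
}
```

If you moved the logs somewhere else in your configuration, pass the matching ServerRoot or adjust the path to suit.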

As in all things, your mileage may vary. But Apache does pretty well. There are many reasons it’s the most popular Web server in the world besides the fact that it’s free.